Preprocessing of Missing Values Using Robust Association Rules

Author

  • Arnaud Ragel
Abstract

1 Introduction

The missing values problem is an old one for analysis tasks [8] [11]. The waste of data that can result from casewise deletion of missing values makes it necessary to propose alternative approaches. A current one is to try to determine these values [9]. However, techniques to guess the missing values must be efficient; otherwise the completion introduces noise. With the emergence of KDD (Knowledge Discovery in Databases) for industrial databases, where missing values are inevitable, this problem has become a priority task [6], one that also requires declarativity and interactivity during treatments. At present, treatments are often specific and internal to the methods, and do not offer such qualities. Consequently, the missing values problem is still a challenging task on the KDD research agenda [6]. We proposed in [14] the RAR (Robust Association Rules) algorithm to correct the weakness of usual association rule algorithms [2] when mining databases with multiple missing values. The efficiency of this algorithm in quickly extracting all the associations contained in such a database allows it to be used for the missing values problem. That is what...
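To make the cost of casewise deletion concrete, here is a minimal sketch in Python. The toy table, its attribute names, and the helper functions are hypothetical illustrations and are not taken from the paper; the sketch only counts how many known values are thrown away when every row containing a missing value is dropped.

```python
# Hypothetical toy table (not from the paper): None marks a missing value.
rows = [
    {"age": "young", "income": "low",  "owns_car": "no"},
    {"age": "young", "income": None,   "owns_car": "no"},
    {"age": None,    "income": "high", "owns_car": "yes"},
    {"age": "old",   "income": "high", "owns_car": None},
    {"age": "old",   "income": "high", "owns_car": "yes"},
]


def casewise_delete(table):
    """Keep only the rows in which every attribute value is known."""
    return [row for row in table if all(v is not None for v in row.values())]


def known_cells(table):
    """Count the attribute values that are not missing."""
    return sum(v is not None for row in table for v in row.values())


complete = casewise_delete(rows)
missing_cells = sum(v is None for row in rows for v in row.values())
discarded_known = known_cells(rows) - known_cells(complete)

print(f"rows kept: {len(complete)} of {len(rows)}")
print(f"missing cells: {missing_cells}, known cells discarded: {discarded_known}")
```

In this toy table a handful of missing cells removes more than half of the rows, which is the kind of waste that motivates completing the missing values instead of deleting the affected cases.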

Similar resources

An Interactive and Understandable Method to Treat Missing Values:

Many analysis tasks have to deal with missing values and some of them have developed specific and internal treatments to guess them. In this paper we present the use of a new method, called MVC (Missing Values Completion), for this question: MVC is based on data preprocessing which gives prominence to understandable associations and gives the user a central part. Such qualities will allow to use...

Mining rules from an incomplete dataset with a high missing rate

The problem of recovering missing values from a dataset has become an important research issue in the field of data mining and machine learning. In this thesis, we introduce an iterative missing-value completion method based on the RAR (Robust Association Rules) support values to extract useful association rules for inferring missing values in an iterative way. It consists of three phases. The ...

Using Association Rules to Make Rule-based Classifiers Robust

Rule-based classification systems have been widely used in real world applications because of the easy interpretability of rules. Many traditional rule-based classifiers prefer small rule sets to large rule sets, but small classifiers are sensitive to the missing values in unseen test data. In this paper, we present a larger classifier that is less sensitive to the missing values in unseen test...

A Novel Algorithm for Association Rule Mining from Data with Incomplete and Missing Values

Missing values and incomplete data are a natural phenomenon in real datasets. If association rules are mined from incomplete data without regard to missing values, mistaken rules are derived. In association rule mining, the treatment of missing values and incomplete data is important. This paper proposes a novel technique to mine association rules from data with missing values in large, voluminous databases. The p...

Algorithm for Missing Values Imputation in Categorical Data with Use of Association Rules

This paper presents a new algorithm for missing values imputation in categorical data. The algorithm is based on association rules and is presented in three variants. Experiments show better accuracy of missing values imputation using the new algorithm than using the most common attribute value.

Publication date: 1998